Skip to content

Fix Python from_dict() round-trip for optional fields with schema defaults#1313

Open
stephentoub wants to merge 3 commits into
mainfrom
stephentoub/jubilant-meme
Open

Fix Python from_dict() round-trip for optional fields with schema defaults#1313
stephentoub wants to merge 3 commits into
mainfrom
stephentoub/jubilant-meme

Conversation

@stephentoub
Copy link
Copy Markdown
Collaborator

Fixes #1139.
Fixes #1140.
Fixes #1141.

Why

from_dict(to_dict(x)) was not the identity for three generated Python dataclasses whenever the affected optional field was None. The dataclass declares the field as T | None = None and to_dict() omits it when None, but from_dict() was substituting the JSON-Schema default value when the key was missing, silently mutating unset fields:

  • SessionTaskCompleteData.summary: None""
  • PermissionPromptRequest.action: NonePermissionPromptRequestMemoryAction.STORE
  • PermissionRequest.action: NonePermissionRequestMemoryAction.STORE

Any code that round-trips events (caching, replay, audit logs, dedup) or pattern-matches on None to mean "absent" was getting the wrong answer end-to-end. The bug was reproducible on main as of today.

What changed

Root cause is in scripts/codegen/python.ts: for optional fields with a schema default, the codegen was emitting obj.get("key", "<default>"), then passing that into from_union([from_none, parse_X], ...), which happily parsed the default and returned it. The dataclass default and the wire default were inconsistent, so the round-trip broke.

  • Dropped the defaultLiteral mechanism from both emitPyClass and emitPyFlatDiscriminatedUnion. from_dict() now uniformly emits obj.get(key), so missing/null keys flow through from_none and land at the dataclass-declared None. Removed the now-unused toPythonLiteral helper.
  • Regenerated python/copilot/generated/session_events.py via npm run generate:python. Exactly three call sites changed, all matching the bug reports.
  • Flipped the two codegen-level assertions in nodejs/test/python-codegen.test.ts that previously locked in the buggy output, and added negative assertions to prevent regression at the generator level.
  • Replaced test_schema_defaults_are_applied_for_missing_optional_fields in python/test_event_forward_compatibility.py (which asserted the bug as expected behavior) with two regression tests covering missing-key parsing, explicit-null parsing, and the full from_dict(to_dict(x)) round-trip for all three affected classes.

Surface area / non-obvious notes

  • Other languages (Go, .NET, Rust, TypeScript) are not affected. Their generators never read propSchema.default to construct deserialization fallbacks; they use real nullable types end-to-end.
  • Grep on the post-fix generated session_events.py and rpc.py confirms zero remaining obj.get("...", "...") patterns, so no other class was silently relying on baked-in schema defaults.
  • The test_schema_defaults_are_applied_for_missing_optional_fields test was the only consumer in the repo of the old behavior. Nothing in runtime code depended on request.action == STORE or task_complete.summary == "" for unset fields.
  • JSON Schema default is an annotation, not validation behavior, so applying it only during deserialization while ignoring it in the dataclass field and in to_dict() was internally inconsistent. This change makes Python deserialization symmetric with serialization and with the in-memory default.

Validation

  • python -m pytest test_event_forward_compatibility.py -> 10/10 pass (including new regression tests).
  • Broader Python suite: 136/136 non-E2E tests pass.
  • npx vitest run on python-codegen.test.ts and other non-CLI test files: pass.
  • ruff check, ruff format --check, ty check copilot, tsc --noEmit, eslint: clean.
  • Manual reproduction of each issue's repro script on unmodified main (changes stashed) confirms the bug; with the fix applied, all three round-trip with equal? True.

Independent review by the code-review agent on claude-opus-4.7-xhigh found no significant issues and confirmed both codegen paths and the test coverage.

Copilot AI review requested due to automatic review settings May 16, 2026 21:57
@stephentoub stephentoub requested a review from a team as a code owner May 16, 2026 21:57
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes Python generated event-model deserialization so that missing optional fields no longer silently substitute JSON-Schema default values, restoring from_dict(to_dict(x)) identity when those fields are None (addressing #1139, #1140, #1141).

Changes:

  • Updated Python codegen to stop emitting obj.get(key, <schema default>) fallbacks during from_dict() generation and always use obj.get(key).
  • Regenerated python/copilot/generated/session_events.py to remove the three problematic defaulted obj.get(...) call sites.
  • Updated generator-level (Node) and Python regression tests to lock in correct missing-key / explicit-null / round-trip behavior.
Show a summary per file
File Description
scripts/codegen/python.ts Removes schema-default fallback emission in from_dict() generation (both class and flat discriminated-union paths).
python/copilot/generated/session_events.py Regenerated output removing defaulted obj.get(..., default) for the affected optional fields.
nodejs/test/python-codegen.test.ts Adjusts assertions to expect obj.get(key) and adds negative assertions preventing regression.
python/test_event_forward_compatibility.py Replaces the prior test that encoded the buggy behavior with regression tests for missing/null parsing and from_dict(to_dict(x)) round-trip.

Copilot's findings

  • Files reviewed: 3/4 changed files
  • Comments generated: 0

@github-actions github-actions Bot mentioned this pull request May 16, 2026
@stephentoub
Copy link
Copy Markdown
Collaborator Author

@brettcannon would you mind looking at this? Is it the right fix?

@github-actions

This comment has been minimized.

@brettcannon
Copy link
Copy Markdown
Contributor

Is it the right fix?

Yep, it's the right fix for the problem.

Copilot AI added 3 commits May 19, 2026 14:03
…aults

Fixes #1139, #1140, #1141.

The Python codegen was embedding JSON-Schema `default` values into
`obj.get(key, default)` for optional fields. But the generated
dataclass field is always `T | None = None` and `to_dict()` omits
the field when `None`, so `from_dict(to_dict(x))` silently mutated
unset fields into the schema default:

- `SessionTaskCompleteData.summary`: `None` -> `""`
- `PermissionPromptRequest.action`: `None` -> `MemoryAction.STORE`
- `PermissionRequest.action`: `None` -> `MemoryAction.STORE`

Drop `defaultLiteral` from both `emitPyClass` and
`emitPyFlatDiscriminatedUnion` so `from_dict()` always uses
`obj.get(key)` (matching the dataclass default). Regenerate
`session_events.py`. Flip the two codegen-level test assertions that
previously locked in the buggy output and add negative assertions.
Replace `test_schema_defaults_are_applied_for_missing_optional_fields`
(which asserted the bug as expected behavior) with regression tests
covering missing-key parsing, explicit-null parsing, and full
`from_dict(to_dict(x))` round-trips for all three affected classes.

Other languages (Go, .NET, Rust, TypeScript) are unaffected; their
generators never read `propSchema.default` for deserialization
fallbacks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The CI prettier check failed on test/python-codegen.test.ts after the assertion update. Apply prettier --write to bring the file back into compliance.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@stephentoub stephentoub force-pushed the stephentoub/jubilant-meme branch from de1684d to fa08e9d Compare May 19, 2026 18:22
@github-actions
Copy link
Copy Markdown
Contributor

Cross-SDK Consistency Review ✅

This PR fixes a Python-only deserialization bug in the codegen pipeline — no cross-SDK consistency issues found.

What changed: The Python codegen (scripts/codegen/python.ts) was incorrectly applying JSON Schema default values via obj.get(key, default), causing from_dict(to_dict(x)) round-trips to silently mutate None optional fields for SessionTaskCompleteData.summary, PermissionPromptRequest.action, and PermissionRequest.action.

Why other SDKs are unaffected: Verified that Go (scripts/codegen/go.ts), .NET (scripts/codegen/csharp.ts), Rust (scripts/codegen/rust.ts), and TypeScript (scripts/codegen/typescript.ts) codegen scripts contain no propSchema.default or equivalent fallback-default deserialization logic — they rely on proper nullable types end-to-end and never had this pattern. A grep across all codegen scripts for the removed defaultLiteral pattern returns zero results, confirming the fix is cleanly isolated.

Verdict: The fix restores internal consistency within the Python SDK (making deserialization symmetric with serialization and with in-memory defaults) without requiring any changes to other language SDKs.

Generated by SDK Consistency Review Agent for issue #1313 · ● 181.2K ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants